ViennaCL - Linear Algebra Library for Multi- and Many-Core Architectures
نویسندگان
چکیده
CUDA, OpenCL, and OpenMP are popular programming models for the multi-core architectures of CPUs and many-core architectures of GPUs or Xeon Phis. At the same time, computational scientists face the question of which programming model to use to obtain their scientific results. We present the linear algebra library ViennaCL, which is built on top of all three programming models, thus enabling computational scientists to interface to a single library, yet obtain high performance for all three hardware types. Since the respective compute backend can be selected at runtime, one can seamlessly switch between different hardware types without the need for error-prone and time-consuming recompilation steps. We present new benchmark results for sparse linear algebra operations in ViennaCL, complementing results for the dense linear algebra operations in ViennaCL reported in earlier work. Comparisons with vendor-libraries show that ViennaCL provides better overall performance for sparse matrix-vector and sparse matrix-matrix products. Additional benchmark results for pipelined iterative solvers with kernel fusion and preconditioners identify the respective sweet spots for CPUs, Xeon Phis, and GPUs.
منابع مشابه
ViennaCL - A High Level Linear Algebra Library for GPUs and Multi-Core CPUs
The vast computing resources in graphics processing units (GPUs) have become very attractive for general purpose scientific computing over the past years. Moreover, central processing units (CPUs) consist of an increasing number of individual cores. Most applications today still make use of a single core only, because standard data types and algorithms in wide-spread procedural languages such a...
متن کاملSolution of eigenvalue problems on heterogeneous computing architectures
In this paper are presented current achievements and the state-of-the-art algorithms and implementations for dense linear algebra on traditional architectures such as single-core machines or distributed memory parallel machines. Also, this paper summarizes the current implementations and publicly available libraries for basic linear algebra for multi-core and many-core architectures (e.g. graph...
متن کاملComparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware LAPACK Working Note ♯217
The emergence and continuing use of multi-core architectures require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of now prevailing parallelism. The Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) is a project that aims to achieve both high performance and portability across a wide range of multi-c...
متن کاملToward Scalable Matrix Multiply on Multithreaded Architectures
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for shared memory architectures with many simultaneous threads of execution, including SMP architectures and future multicore processors. The always-important matrix-matrix multiplication is used to demonstrate that a simple...
متن کاملMulti-Threaded Dense Linear Algebra Libraries for Low-Power Asymmetric Multicore Processors
Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the BLAS (and LAPACK) functionality for many current multi-threaded architectures, the adaption of these libraries for asymmetric multicore processors (AMPs) is still pending. In this pap...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- SIAM J. Scientific Computing
دوره 38 شماره
صفحات -
تاریخ انتشار 2016